Improving Classification-Based Natural Language Understanding with Non-Expert Annotation

Authors

  • Fabrizio Morbini
  • Eric Forbell
  • Kenji Sagae
Abstract

Although data-driven techniques are commonly used for Natural Language Understanding in dialogue systems, their efficacy is often hampered by the lack of appropriate annotated training data in sufficient amounts. We present an approach for rapid and cost-effective annotation of training data for classification-based language understanding in conversational dialogue systems. Experiments using a web-accessible conversational character that interacts with a varied user population show that a dramatic improvement in natural language understanding and a substantial reduction in expert annotation effort can be achieved by leveraging non-expert annotation.

Similar resources

Classification of telicity using cross-linguistic annotation projection

This paper addresses the automatic recognition of telicity, an aspectual notion. A telic event includes a natural endpoint (she walked home), while an atelic event does not (she walked around). Recognizing this difference is a prerequisite for temporal natural language understanding. In English, this classification task is difficult, as telicity is a covert linguistic category. In contrast, in ...

Fast semi-automatic semantic annotation for spoken dialog systems

This paper describes a bootstrapping methodology for semi-automatic semantic annotation of a “mini-corpus” that is conventionally annotated manually to train an initial parser used in natural language understanding (NLU) systems. We propose to cast the problem of semantic annotation as a classification problem: each word is assigned a unique set of semantic tag(s) and/or label(s) from the univ...

Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks

Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon’s Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web. We investigate five tasks: affect recognition, word similarity, recognizing textual en...

Improving Classification of Natural Language Answers to ITS Questions with Item-Specific Supervised Learning

In a natural language intelligent tutoring system, improving assessment of student input is a challenge that typically requires close collaboration between domain experts and NLP experts. This paper proposes a method for building small item-specific classifiers that would allow a domain expert author to improve quality of assessment for student input, through supervised tagging of a small numbe...

On Designing Controlled Natural Languages for Semantic Annotation

Manual semantic annotation is a complex and arduous task, both time-consuming and costly, often requiring specialist annotators. (Semi)-automatic annotation tools attempt to ease this process by detecting instances of classes within text and relationships between classes; however, their usage often requires knowledge of Natural Language Processing (NLP) and/or formal ontological descriptions. This ...

Publication date: 2014